model monitoring
Gini-based Model Monitoring: A General Framework with an Application to Non-life Insurance Pricing
In a dynamic landscape where portfolios and environments evolve, maintaining the accuracy of pricing models is critical. To the best of our knowledge, this is the first study to systematically examine concept drift in non-life insurance pricing. We (i) provide an overview of the relevant literature and commonly used methodologies, clarify the distinction between virtual drift and concept drift, and explain their implications for long-run model performance; (ii) review and formalize common performance measures, including the Gini index and deviance loss, and articulate their interpretation; (iii) derive the asymptotic distribution of the Gini index, enabling valid inference and hypothesis testing; and (iv) present a standardized monitoring procedure that indicates when refitting is warranted. We illustrate the framework using a modified real-world portfolio with induced concept drift and discuss practical considerations and pitfalls.
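As a rough illustration of the two performance measures named in the abstract above, the following sketch computes a rank-based Gini index and a mean Poisson deviance. Both definitions are assumptions chosen for illustration: the Gini index is taken as one minus twice the area under the Lorenz curve of observed outcomes ordered by prediction (unnormalized, with no special handling of tied predictions), and the Poisson deviance is only one possible deviance loss. The paper's own formalization may differ, and the function names `gini_index` and `poisson_deviance` are hypothetical.

```python
import math


def gini_index(actual, predicted):
    """Rank-based Gini: 1 - 2 * area under the Lorenz curve of `actual`,
    with observations sorted by ascending `predicted` (assumed definition)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(actual)
    if total <= 0:
        raise ValueError("sum of actual outcomes must be positive")
    n = len(actual)
    # Order observations from lowest to highest predicted value.
    order = sorted(range(n), key=lambda i: predicted[i])
    area = 0.0
    cum_prev = 0.0
    for i in order:
        cum = cum_prev + actual[i] / total
        # Trapezoid between consecutive points of the Lorenz curve.
        area += (cum_prev + cum) / 2.0 / n
        cum_prev = cum
    return 1.0 - 2.0 * area


def poisson_deviance(actual, predicted):
    """Mean Poisson deviance: 2/n * sum(y*log(y/mu) - (y - mu)),
    with the convention y*log(y/mu) = 0 when y = 0."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, mu in zip(actual, predicted):
        if mu <= 0:
            raise ValueError("predicted means must be positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total / len(actual)


claims = [0, 0, 1, 3]             # toy observed claim counts
scores = [0.1, 0.2, 0.5, 1.0]     # toy model predictions
print(gini_index(claims, scores))
print(poisson_deviance(claims, scores))
```

A Gini index close to zero indicates that the predictions barely rank the outcomes, while a falling Gini index or a rising deviance over successive monitoring periods is the kind of signal a monitoring procedure would test for.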
Position Paper: Rethinking AI/ML for Air Interface in Wireless Networks
Kontes, Georgios, Michalopoulos, Diomidis S., Ghimire, Birendra, Mutschler, Christopher
AI/ML research has predominantly been driven by domains such as computer vision, natural language processing, and video analysis. In contrast, the application of AI/ML to wireless networks, particularly at the air interface, remains in its early stages. Although there are emerging efforts to explore this intersection, fully realizing the potential of AI/ML in wireless communications requires a deep interdisciplinary understanding of both fields. We provide an overview of AI/ML-related discussions in 3GPP standardization, highlighting key use cases, architectural considerations, and technical requirements. We outline open research challenges and opportunities where academic and industrial communities can contribute to shaping the future of AI-enabled wireless systems.
Confidence-based Estimators for Predictive Performance in Model Monitoring
Kivimäki, Juhani, Białek, Jakub, Nurminen, Jukka K., Kuberski, Wojtek
After a machine learning model has been deployed into production, its predictive performance needs to be monitored. Ideally, such monitoring can be carried out by comparing the model's predictions against ground truth labels. For this to be possible, the ground truth labels must be available relatively soon after inference. However, there are many use cases where ground truth labels are available only after a significant delay, or in the worst case, not at all. In such cases, directly monitoring the model's predictive performance is impossible. Recently, novel methods have been developed for estimating the predictive performance of a model when ground truth is unavailable. Many of these methods leverage model confidence or other uncertainty estimates and are experimentally compared against a naive baseline method, namely Average Confidence (AC), which estimates model accuracy as the average of confidence scores for a given set of predictions. However, until now the theoretical properties of the AC method have not been properly explored. In this paper, we address this gap by reviewing the AC method and showing that, under certain general assumptions, it is an unbiased and consistent estimator of model accuracy with many desirable properties. We also compare this baseline estimator empirically against some more complex estimators and show that in many cases the AC method outperforms the others, although the comparative quality of the different estimators is heavily case-dependent.
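The Average Confidence estimator described above can be sketched in a few lines. This is a minimal illustration, assuming that each prediction comes with a vector of class probabilities and that the confidence score is the largest of them; the function name `average_confidence` and the toy inputs are hypothetical, and the estimate is only meaningful to the extent that the model's confidences are well calibrated.

```python
def average_confidence(probabilities):
    """Estimate accuracy as the mean top-class probability.

    `probabilities` is a list of per-sample class-probability vectors.
    No ground-truth labels are needed.
    """
    if not probabilities:
        raise ValueError("need at least one prediction")
    # Confidence of a prediction = probability of the predicted class.
    confidences = [max(p) for p in probabilities]
    return sum(confidences) / len(confidences)


# Toy batch of three binary predictions (made-up values).
batch = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
print(average_confidence(batch))
```

Because the estimate requires only the model's own outputs, it can be computed for every production batch, which is what makes it usable when labels arrive late or never.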
Best Practices For Machine Learning Model Monitoring - MarkTechPost
Model monitoring is the process of regularly evaluating, tracking, and auditing machine learning models. This process helps data science and machine learning teams identify any issues with their models and take appropriate action to address them. Through model monitoring, teams can ensure that their models are functioning optimally and meeting the needs of their users and stakeholders. The practice of monitoring ML model performance is crucial in the transition towards more reliable and unbiased AI systems. Monitoring ML models in both training and production allows for control over the product, early detection of issues, and immediate intervention when necessary.
A Human-Centric Take on Model Monitoring
Shergadwala, Murtuza N, Lakkaraju, Himabindu, Kenthapadi, Krishnaram
Predictive models are increasingly used to make consequential decisions in high-stakes domains such as healthcare, finance, and policy. It is therefore critical to ensure that these models make accurate predictions, are robust to shifts in the data, do not rely on spurious features, and do not unduly discriminate against minority groups. To this end, several approaches spanning areas such as explainability, fairness, and robustness have been proposed in recent literature. Such approaches need to be human-centered because they serve users' understanding of the models. However, there is a research gap in understanding the human-centric needs and challenges of monitoring machine learning (ML) models once they are deployed. To fill this gap, we conducted an interview study with 13 practitioners who have experience at the intersection of deploying ML models and engaging with customers spanning domains such as financial services, healthcare, hiring, online retail, computational advertising, and conversational assistants. We identified various human-centric challenges and requirements for model monitoring in real-world applications. Specifically, we found that model monitoring systems need to clarify the impact of monitoring observations on outcomes, and that doing so is challenging. Further, such insights must be actionable, robust, customizable for domain-specific use cases, and cognitively considerate to avoid information overload.
Google Revamps Its Cloud-Based Machine Learning Platform Yet Again
At the Google I/O event, the company announced a revamped cloud-based machine learning platform branded as Vertex AI. It's unusual for Google to announce cloud-related services at Google I/O, a forum for launching consumer and developer technologies. Since Cloud Next - the flagship cloud user conference - was postponed to October, Vertex AI found its place in the I/O announcements. This is the third iteration of the Google Cloud ML platform since its original launch. What prompted Google to launch Vertex AI? Let's find out.
How the COVID-19 Pandemic is Accelerating the Need for Model Monitoring
Data models that predate the pandemic may not reflect today's business environment. It's time to give models a checkup to make sure they reflect current conditions. It's no secret that the COVID-19 pandemic has had an impact on nearly every facet of business operations, and organizations that depend on artificial intelligence (AI) and machine learning (ML) to automate business decisions and critical business processes have been particularly vulnerable. Because of dramatic changes in both the overall economic environment and specific consumer behaviors since the onset of the pandemic, AI/ML models in organizations of all sizes and in every industry have been rendered largely ineffective: the pre-pandemic data on which the models were trained is no longer relevant or predictive of current behavior. Once in production, a model's behavior can change if production data diverges from the data used to train it.
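One simple way to check whether production data has diverged from training data is to compare the two samples of a single numeric feature. The sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the two empirical distribution functions. It is an illustrative choice rather than a method taken from the article; the function name `ks_statistic` and the sample values are hypothetical, and any alert threshold would have to be chosen per use case.

```python
from bisect import bisect_right


def ks_statistic(reference, production):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of the two samples."""
    if not reference or not production:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    prod = sorted(production)
    largest_gap = 0.0
    # The supremum is attained at one of the observed values.
    for x in ref + prod:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_prod = bisect_right(prod, x) / len(prod)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_prod))
    return largest_gap


# Toy feature values before and after a shift (made-up numbers).
training_values = [1, 2, 3, 4]
production_values = [3, 4, 5, 6]
print(ks_statistic(training_values, production_values))
```

A statistic near 0 suggests the feature is distributed as it was at training time, while a value near 1 means the two samples barely overlap. Note that this only detects changes in the input distribution; a change in the relationship between inputs and outcomes requires monitoring performance against labels.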